Non-negative matrix factorization (NMF) is the problem of determining twonon-negative low rank factors $W$ and $H$, for the given input matrix $A$, suchthat $A \approx W H$. NMF is a useful tool for many applications in differentdomains such as topic modeling in text mining, background separation in videoanalysis, and community detection in social networks. Despite its popularity inthe data mining community, there is a lack of efficient parallel software tosolve the problem for big datasets. Existing distributed-memory algorithms arelimited in terms of performance and applicability, as they are implementedusing Hadoop and are designed only for sparse matrices. We propose a distributed-memory parallel algorithm that computes thefactorization by iteratively solving alternating non-negative least squares(NLS) subproblems for $W$ and $H$. To our knowledge, our algorithm is the firsthigh-performance parallel algorithm for NMF. It maintains the data and factormatrices in memory (distributed across processors), uses MPI for interprocessorcommunication, and, in the dense case, provably minimizes communication costs(under mild assumptions). As opposed to previous implementations, our algorithmis also flexible: (1) it performs well for dense and sparse matrices, and (2)it allows the user to choose from among multiple algorithms for solving localNLS subproblems within the alternating iterations. We demonstrate thescalability of our algorithm and compare it with baseline implementations,showing significant performance improvements.
展开▼
机译:非负矩阵分解(NMF)是确定给定输入矩阵$ A $的两个非负低秩因子$ W $和$ H $的问题,从而使$ A \ W W $。 NMF是在不同领域中的许多应用程序的有用工具,例如文本挖掘中的主题建模,视频分析中的背景分离以及社交网络中的社区检测。尽管它在数据挖掘社区中很受欢迎,但是仍然缺少有效的并行软件来解决大型数据集的问题。现有的分布式内存算法在性能和适用性方面受到限制,因为它们是使用Hadoop实现的,并且仅适用于稀疏矩阵。我们提出了一种分布式内存并行算法,该算法通过迭代求解$ W $和$ H $的交替非负最小二乘(NLS)子问题来计算因式分解。据我们所知,我们的算法是NMF的第一个高性能并行算法。它将数据和因子矩阵保存在内存中(分布在处理器之间),使用MPI进行处理器间通信,在密集情况下,可证明地最小化了通信成本(在温和的假设下)。与以前的实现相反,我们的算法也很灵活:(1)它对于密集和稀疏矩阵表现良好,(2)它允许用户从多种算法中进行选择,以解决交替迭代中的localNLS子问题。我们演示了算法的可扩展性,并将其与基线实现进行了比较,显示出显着的性能改进。
展开▼